Abstract

This paper describes a heap construction that supports insert and delete operations
in arbitrary (possibly illegitimate) states. After any sequence of at most O(m) heap
operations, the heap state is guaranteed to be legitimate, where m is the initial number
of items in the heap. The response from each operation is consistent with its effect on
the data structure, even for illegitimate states. The time complexity of each operation is
O(lg K), where K is the capacity of the data structure; when the heap's state is legitimate,
the time complexity is O(lg n), where n is the number of items in the heap.

Keywords: data structures, fault tolerance, recovery, self-stabilization 



1 Introduction 

Increased visibility of systems emphasizes the theme of system availability. When
availability is not important, systems may handle failures by stopping normal system activity and
restoring damaged data from a recent backup copy. But stopping and restoring from a backup
interferes with system availability, and in many instances it is preferable to let system services
continue, even if the behavior of the services shows temporary inconsistencies. Bookkeeping
is intrinsic to system implementation, so data structures are found at the core of system
programs. The objective of system availability motivates research on data structures with
availability properties.

(Self-)stabilization is the paradigm usually associated with recovery from transient faults
of unlimited scope. Stabilization is traditionally associated with a distributed system,
where each process can perpetually check and repair its variables. Our work departs from
this traditional setting: we investigate a sequential, non-distributed data structure, supposing
that the data structure is managed only by standard methods. We further suppose that each
method invocation starts cleanly (with no transient damage to internal or control variables of
that invocation), which resembles other work on fault tolerance [5]. The heap proposed
here also constrains operation behavior during the period of convergence to a legitimate state,
an issue not usually addressed by stabilization research (papers [2] and [6] are exceptions).

2 Stabilizing Heap Construction 

The data structure presented below is a variant of the standard binary heap with a
maximum capacity of K items. Two operations are defined for the heap: insert(p), which
inserts value p into the heap, and deleteMin(), which returns and removes an item of least
value from the heap. We say that an insert(p) succeeds if it inserts item p into the heap
and fails otherwise. The response to an insert(p) invocation indicates success or failure by
returning "ack" for success and "heap full" for failure. Similarly, a deleteMin() succeeds if
it returns an item and fails by returning a "heap empty" indication.

The heap construction described here is based on a binary, balanced tree of K nodes, rooted
at a node named root, and denoted by A. Each node x in tree A has two associated constants
x.left and x.right, which refer respectively to the left and right child of x in A. The symbol
λ denotes the absence of a child: x.left = λ (x.right = λ) indicates that x has no left (right)
child in A. We suppose that x.left and x.right cannot be corrupted by a transient fault. (A
conventional heap implementation by an array satisfies this assumption, since there is
a static mapping between parent and child.) Each node x has three variable fields, x.val,
x.height, and x.nextslot, used for storage and management of heap items in the tree.

The field x.val may contain a heap item for node x. We use the symbol ∞, which is a
value outside the domain of possible heap items, to indicate the absence of a heap item at
a particular node. For convenience, let λ.val = ∞ and let y.val ≤ ∞ hold by definition
for all y ∈ A. We define tree T_A to be a truncation of tree A that includes only nodes in
a path of non-∞ values. Formally, for any node x, let x ∈ T_A iff x.val ≠ ∞ and either
x is root or x is the child, with respect to A, of some y satisfying y ∈ T_A (the definition is
recursive). It follows that T_A is empty if root.val = ∞. A node x ∈ T_A is a leaf of T_A iff
(x.left).val = (x.right).val = ∞. The expression ⟨T_A⟩ denotes the bag (multiset) given by
{x.val | x ∈ T_A}, and |T_A| is the number of nodes in tree T_A.
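
The recursive definition of the truncated tree can be made concrete with a small sketch.
It assumes the conventional array layout mentioned above (node i has children 2i and 2i+1,
with index 0 unused) and uses math.inf for ∞; the function names nodes_of_TA and bag are
illustrative and do not come from the construction itself.

```python
import math

INF = math.inf  # stands in for the symbol ∞ (no item stored at the node)


def children(i, K):
    """Indices of the children of node i in A (1-based array layout, an assumption)."""
    return [c for c in (2 * i, 2 * i + 1) if c <= K]


def nodes_of_TA(val, K):
    """Return the set of node indices that belong to the truncated tree T_A."""
    if val[1] == INF:
        return set()  # root holds no item, so T_A is empty
    members, stack = set(), [1]
    while stack:
        x = stack.pop()
        members.add(x)
        for c in children(x, K):
            if val[c] != INF:  # a child joins T_A only through a parent in T_A
                stack.append(c)
    return members


def bag(val, members):
    """The multiset of values stored in the given nodes, as a sorted list."""
    return sorted(val[x] for x in members)
```

Note that membership depends only on non-∞ paths from root: a node holding a value below an
∞ node is excluded, while a node that violates heap order is still included.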

Implementation of a binary heap should satisfy the balance property and the heap property.
The balance property is that any heap of m items is contained in a tree T_A with height
O(lg m). The heap property holds at x ∈ T_A iff x.val ≤ y.val for any child y of x. The data
structure satisfies the heap property iff the heap property holds at every x ∈ T_A.

Legitimate State. The fields of A's nodes determine whether or not the tree is in a
legitimate state. The state of A is legitimate iff (i) for every x ∈ T_A, the heap property holds;
(ii) the balance property holds; (iii) for every x ∈ T_A, x.height is the height of the subtree (in
T_A) rooted at x; and (iv) for every x ∈ T_A, x.nextslot equals the minimum distance from x to
a descendant y ∈ T_A such that y has fewer children in T_A than in A, and if no such descendant
y exists, x.nextslot can have any value satisfying x.nextslot ≥ K.
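
As an illustration of conditions (i), (iii) and (iv), the following sketch recomputes the true
height and slot distance of every node of the truncated tree and compares them with the stored
fields. It assumes an array layout (children of node i at 2i and 2i+1, index 0 unused), treats a
node as its own descendant at distance 0, and does not check the balance property (ii); the name
check_fields is hypothetical.

```python
import math

INF = math.inf  # stands in for ∞


def check_fields(val, height, nextslot, K):
    """Check (i), (iii) and (iv) on the truncated tree; (ii) is not checked here."""

    def visit(x):
        """Return (ok, true_height, true_slot_distance) for the subtree at x."""
        slots = [c for c in (2 * x, 2 * x + 1) if c <= K]   # children in A
        kids = [c for c in slots if val[c] != INF]          # children in T_A
        ok = all(val[x] <= val[c] for c in kids)            # (i) heap property at x
        h = 0
        d = 0 if len(kids) < len(slots) else INF            # x itself has a free slot
        for c in kids:
            c_ok, c_h, c_d = visit(c)
            ok = ok and c_ok
            h = max(h, c_h + 1)
            d = min(d, c_d + 1)
        ok = ok and height[x] == h                          # (iii)
        if d == INF:
            ok = ok and nextslot[x] >= K                    # (iv), no free slot below x
        else:
            ok = ok and nextslot[x] == d                    # (iv)
        return ok, h, d

    return val[1] == INF or visit(1)[0]
```

A checker of this kind visits every node, so it is useful only for testing; the construction
itself never scans the whole tree within one operation.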

Basic Operations. Because conventional heap operations are well known, we provide only
sketches of the operation logic. Two internal routines deepLeaf and findSlot assist in node
allocation and tree maintenance. Let deepLeaf(root) be a recursive procedure that locates a
node of T_A having maximum depth: deepLeaf(x) compares the height fields of x's children,
and if one of them, say y, has greater height, then deepLeaf(x) returns deepLeaf(y); and
if both children have equal height fields, then deepLeaf(x) returns deepLeaf(y) for some
(possibly nondeterministic choice of) y ∈ {x.left, x.right}; deepLeaf(x) returns x if x has no
children. A call to deepLeaf(root) returns the symbol λ if T_A is empty.

Let findSlot(root) be a recursive procedure locating a minimum-depth node y in A such
that y ∉ T_A: if x has one child y in A such that y ∉ T_A, then findSlot(x) returns y; if x has
two children in A such that neither is in T_A, then findSlot(x) returns an arbitrary child of
x; and if x has two children in T_A, findSlot(x) compares the nextslot fields of x's children
and returns findSlot(y) for the child y that has a smaller nextslot field (if both children have
equal nextslot fields, then y can be any child of x). An invocation of findSlot(root) may fail
to return an element y, and instead will return symbol λ, if no such y can be located.
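
A minimal sketch of the two routines, without the stabilizing modifications introduced later,
might look as follows. It assumes an array layout (children of node i at 2i and 2i+1), represents
λ by None and ∞ by math.inf, breaks ties toward the left child, and returns the root itself as the
free slot when the tree is empty; that last case and the class name Heap are assumptions made
for illustration only.

```python
import math

INF = math.inf  # stands in for ∞


class Heap:
    """Fields of the K-node tree A, stored in arrays indexed 1..K (assumed layout)."""

    def __init__(self, K):
        self.K = K
        self.val = [INF] * (K + 1)
        self.height = [0] * (K + 1)
        self.nextslot = [K] * (K + 1)

    def kids(self, x):
        """Children of x in A."""
        return [c for c in (2 * x, 2 * x + 1) if c <= self.K]


def deep_leaf(h, x=1):
    """Follow the larger height field down to a leaf; None plays the role of λ."""
    if h.val[x] == INF:
        return None
    present = [c for c in h.kids(x) if h.val[c] != INF]
    if not present:
        return x
    return deep_leaf(h, max(present, key=lambda c: h.height[c]))


def find_slot(h, x=1):
    """Follow the smaller nextslot field down to a free child position."""
    if h.val[x] == INF:
        return x  # assumption: an empty tree offers the root as the slot
    kids = h.kids(x)
    free = [c for c in kids if h.val[c] == INF]
    if free:
        return free[0]
    if not kids:
        return None  # x is a leaf of A: nothing available on this path
    return find_slot(h, min(kids, key=lambda c: h.nextslot[c]))
```

Both routines walk a single root-to-leaf path, so their cost is bounded by the height of A
regardless of what the height and nextslot fields contain.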

The implementation of insert(p) consists of assigning y := findSlot(root), and if y = λ,
then insert(p) responds with a "heap full" indication; otherwise, the following sequence
executes. First, the operation assigns y.val, y.height, y.nextslot := p, 0, t where t = 0 if y
has a child in A and otherwise t = K. Second, z.val := ∞ is assigned for every child z
of y in A. Third, the operation calls upHeapify(y). The upHeapify routine swaps values
of items on a path from y to root, until each parent has a value at most that of its child
in the path. As upHeapify traverses the path from leaf to root, it also enforces, for each
item y on the path, y.height := 1 + max((y.left).height, (y.right).height) and y.nextslot :=
1 + min((y.left).nextslot, (y.right).nextslot) (with appropriate adjustments to these expressions
for cases of single or no children).

The implementation of deleteMin() consists of y := deepLeaf(root), and if y ≠ λ, saving
root.val for the response, then assigning root.val := y.val and y.val := ∞, and then calling
downHeapify(root). Along the path from y to root, the height and nextslot fields are also
recomputed owing to the deletion of y. The downHeapify routine swaps values as needed,
along some path from root to a leaf, so that the heap property is restored to T_A.

Active Tree. If A is not in a legitimate state, it is still possible to consider the maximum
fragment of A that enjoys the heap property. The active tree S_A is defined recursively by: if
root.val = ∞ then S_A is empty, otherwise root ∈ S_A; and if x ∈ S_A and y is a child of x (with
respect to A) such that y.val ≠ ∞ and y.val ≥ x.val, then y ∈ S_A.
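
The active tree can be computed by the same kind of traversal as the truncated tree, with the
extra requirement that a child's value is not smaller than its parent's. The sketch below assumes
an array layout (children of node i at 2i and 2i+1, index 0 unused) and math.inf for ∞; the name
active_tree is illustrative.

```python
import math

INF = math.inf  # stands in for ∞


def active_tree(val, K):
    """Return the set of node indices in the active tree S_A."""
    if val[1] == INF:
        return set()
    members, stack = {1}, [1]
    while stack:
        x = stack.pop()
        for c in (2 * x, 2 * x + 1):
            # a child is active only if it exists, holds an item,
            # and respects heap order relative to its active parent
            if c <= K and val[c] != INF and val[c] >= val[x]:
                members.add(c)
                stack.append(c)
    return members
```

In a state that satisfies the heap property the two trees coincide; in a corrupted state the
active tree is the part that operations can still rely on.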

Operation Modifications for Stabilization. The definition of a legitimate state implies,
for every x ∈ S_A and y ∉ S_A where y is a child of x in A, that the val field of y is ∞. In
an illegitimate state, this condition need not hold, though S_A is defined even for illegitimate
states. Our first modification of operations (including internal routines such as deepLeaf
and findSlot) is the following: whenever an item x ∈ S_A is encountered, it is first examined
to verify that every child y satisfies y.val ≥ x.val, and if this is not the case, then y.val := ∞
is immediately assigned. (The only exceptions to this modification are the heapify routines,
which are expected to encounter value reversals along a particular path.) The result of this
modification is that operations consider children and leaves with respect to S_A rather than
T_A. Observe that if a sequence of heap operations could somehow encounter all the nodes of
T_A, then T_A = S_A would hold as a result. A second modification introduced below does this,
enforcing a scan of sufficiently many nodes of T_A over any sequence of heap operations so that
T_A = S_A will hold. We call the first modification "truncation" since it removes nodes from T_A to
enforce the heap property. The truncation modification is a convenience for our presentation;
another possibility would be to treat y.val as equivalent to ∞ whenever y.val < x.val for
x the parent of y, and to adjust the definition of legitimate state (and T_A) accordingly.

The following lemma considers an operation applied to A in an arbitrary (possibly
illegitimate) state; for this lemma, T denotes S_A prior to the operation and T' denotes S_A after the
operation.

Lemma 1 An operation applied to A in an arbitrary state satisfies: if insert(p) succeeds,
then ⟨T'⟩ = ⟨T⟩ ∪ {p}; if insert(p) fails, then ⟨T'⟩ = ⟨T⟩; a deleteMin operation fails iff
T is empty; if deleteMin returns value q, then q = min⟨T⟩ and ⟨T'⟩ = ⟨T⟩ \ {q}; and any
operation completes in O(lg K) time.



Proof: Although findSlot(root) does not guarantee to find an available position for an
insert(p) operation for illegitimate A, if findSlot(root) does return r ≠ λ, then from the
logic of findSlot, r is a child of some node of T, and thus T' will contain p as a result of
insert(p). For a deleteMin operation, deepLeaf(root) returns some leaf of T (not necessarily
at greatest depth in T) provided T is nonempty, so deleteMin returns root.val of T. The
O(lg K) time bound is satisfied because any path from root to leaf in T has length at most
lg K. □

The accuracy of height and nextslot fields is critical for maintaining the balance condition
and locating an available node for heap insertion. In an illegitimate state, these fields have
arbitrary values. Although heap operations recompute height and nextslot fields, such
recomputation is limited to paths selected by the operations. The second change we make to
operations is to add calls to a new routine verify. Each application of verify works on
three objectives: (1) to apply truncation along one path P from root to a leaf of S_A, (2)
to assign height and nextslot fields from leaf to root in P, and (3) to modify fields so that
the next invocation of verify will select a path different from P. To support objective (3),
we add a new binary field toggle, with domain {ℓ, r}, to every node. The path P chosen by
verify is obtained by following toggle directions from root until a leaf of S_A is reached. Our
intent is that O(|S_A|) successive invocations of verify will visit all nodes of the active tree.

Figure 1 shows routine verify. Objectives (1) and (2) of verify are achieved with
straightforward calculations. Although not shown explicitly in the figure, verify first checks values
and assigns ∞ if necessary to enforce the heap property, as needed for the truncation
procedure. The implementation of objective (3) is more complicated, using subordinate routines
nextPath, leftFringe, and swAncestor.

The first few lines of verify in Figure 1 assign x.toggle for the case of x having fewer than
two children: in case x has only a single child, then x.toggle should be r or ℓ according to the
location of its only child. In case x has no children, x.toggle is assigned ℓ. The setup for a
new path occurs by a call to nextPath(x), which only occurs when x is a leaf of the active tree.
The idea of nextPath(x) is to locate an ancestor w of x with w.toggle = ℓ and change w.toggle
to r, thereby setting up the "next path" for verify to examine. The routine leftFringe
ensures that whenever such a change of w.toggle from ℓ to r takes place, the toggle fields are
such that the leftmost path of the subtree rooted at w.right will be selected for this "next
path" of verify. All lines but the last of leftFringe consider degenerate cases (single or
no children). Not shown in the figure is the swAncestor procedure: swAncestor(x) returns
the nearest ancestor w ≠ x such that w has two children in S_A and w.toggle = ℓ; if no such
ancestor w exists, then swAncestor(x) returns λ. Observe that if swAncestor does return λ,
then nextPath(x) sets up the "next path" to be the leftmost path starting from root in S_A.

Lemma 2 If ⌊(|S_A| + 1)/2⌋ successive invocations of verify(root) are applied to arbitrary
A, then as a result T_A = S_A and properties (i), (iii) and (iv) hold.

Proof: Let P denote the path of nodes examined via recursion for a given invocation of
verify(root). Observe that leftFringe(root) is called within this invocation of verify iff
P is the rightmost path within S_A (otherwise swAncestor would return a non-λ value). By
construction, leftFringe(x) for x ≠ root sets up the leftmost path in S_A to the right of P.
Let T_x be a subtree of S_A, rooted at x, with m leaves. If m successive verify invocations
examine the leaves of T_x, then (i), (iii) and (iv) hold for the nodes of T_x afterwards (this can
be shown by induction on subtree height). Let S be a preorder listing of the leaves of S_A;
S has at most ⌊(|S_A| + 1)/2⌋ items since any binary tree of n items has at most (n + 1)/2
leaves. It is straightforward to show that any |S| successive invocations of verify(root) visit
the leaves of S_A in an order corresponding to some rotation of sequence S. □

The remaining modification to standard heap operations consists of having each insert and
deleteMin operation begin with "verify(root); balance(root)". A balance(root)
invocation consists of deleting a leaf r found by deepLeaf(root) from the heap and then reinserting
r into the heap. When all height and nextslot fields in the active tree have legitimate
values, the effect of balance(root) is to move an item from a position of maximum depth to a
position of minimum depth. Since these fields could be illegitimate, care must be taken in
the implementation of balance so that deleted leaf r is, in any case, reinserted into the heap
(perhaps by reversing the delete of r if reinsertion fails). To see why the balance(root) call
is needed, consider an initial active tree with height k that has only one leaf. Without the
balance(root) call, if only insert operations are applied to the heap, then O(2^k) operations
would be required to bring the tree into balance.

Lemma 3 Let m = |S_A| for an arbitrary initial state of A. After any sequence of at most
m + 1 operations, A satisfies properties (i), (iii) and (iv) at all subsequent states.

verify(x)
    if x.left = λ then x.toggle := r
    if x.right = λ then x.toggle := ℓ
    if ¬leaf(x) then
        if x.toggle = r then verify(x.right) else verify(x.left)
    else nextPath(x)
    calculate & assign x.height, x.nextslot

nextPath(x)
    w := swAncestor(x)
    if w ≠ λ then w.toggle := r ; leftFringe(w.right) ; return
    else leftFringe(root)

leftFringe(x)
    if x = λ ∨ leaf(x) return
    if x.left = λ then x.toggle := r ; leftFringe(x.right) ; return
    if x.right = λ then x.toggle := ℓ ; leftFringe(x.left) ; return
    x.toggle := ℓ ; leftFringe(x.left) ; return

Figure 1: verify(x) and subordinate procedures.
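
The routines of Figure 1 can be exercised with a small executable sketch. It is not the
implementation from the figure: it assumes an array layout (children of node i at 2i and 2i+1),
folds the truncation check into a helper named live, writes the toggle values as "l" and "r", and
fills in swAncestor and the height and nextslot recomputation from their prose descriptions. All
Python names are illustrative.

```python
import math

INF = math.inf  # stands in for ∞


class Tree:
    """Array layout of A; height and nextslot start with deliberately wrong values."""

    def __init__(self, vals, toggle="r"):
        self.K = len(vals)
        self.val = [None] + list(vals)
        self.height = [99] * (self.K + 1)
        self.nextslot = [99] * (self.K + 1)
        self.toggle = [toggle] * (self.K + 1)

    def slots(self, x):
        """Children of x in A."""
        return [c for c in (2 * x, 2 * x + 1) if c <= self.K]

    def live(self, x):
        """Truncate reversed children of x, then return (left, right) within S_A."""
        out = []
        for c in (2 * x, 2 * x + 1):
            if c > self.K:
                out.append(None)
                continue
            if self.val[c] < self.val[x]:
                self.val[c] = INF  # truncation: drop a child that breaks heap order
            out.append(c if self.val[c] != INF else None)
        return tuple(out)


def left_fringe(t, x):
    """Point toggles along the leftmost path of the active subtree rooted at x."""
    if x is None:
        return
    l, r = t.live(x)
    if l is None and r is None:
        return
    if l is None:
        t.toggle[x] = "r"
        left_fringe(t, r)
    else:
        t.toggle[x] = "l"
        left_fringe(t, l)


def sw_ancestor(t, x):
    """Nearest proper ancestor with two active children and toggle "l", else None."""
    w = x // 2
    while w >= 1:
        l, r = t.live(w)
        if l is not None and r is not None and t.toggle[w] == "l":
            return w
        w //= 2
    return None


def next_path(t, x):
    w = sw_ancestor(t, x)
    if w is not None:
        t.toggle[w] = "r"
        left_fringe(t, 2 * w + 1)
    else:
        left_fringe(t, 1)


def verify(t, x=1):
    if t.val[x] == INF:
        return  # empty active tree
    l, r = t.live(x)
    if l is None:
        t.toggle[x] = "r"
    if r is None:
        t.toggle[x] = "l"
    if l is None and r is None:
        next_path(t, x)
    else:
        verify(t, r if t.toggle[x] == "r" else l)
    kids = [c for c in (l, r) if c is not None]
    t.height[x] = 1 + max(t.height[c] for c in kids) if kids else 0
    if len(kids) < len(t.slots(x)):
        t.nextslot[x] = 0
    elif not kids:
        t.nextslot[x] = t.K
    else:
        t.nextslot[x] = 1 + min(t.nextslot[c] for c in kids)
```

For example, with values [1, 4, 2, 3, 6, 8, ∞] in a 7-node tree, node 4 (value 3 under parent
value 4) is truncated on the first visit to node 2, the active tree has five nodes and two leaves,
and three calls to verify suffice to repair every height and nextslot field on it.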

Proof: Each insert and deleteMin operation invokes verify(root); however, such
operations also change the active tree. With respect to the sequence of verify(root) calls starting
from the initial state, each change to the active tree either occurs in a subtree previously
visited by verify or occurs in a subtree not yet visited by verify. In the former case, the
tree modification satisfies (i), (iii) and (iv) along the path from root to the modified nodes.
In the latter case, a future verify establishes the desired properties. After d operations, the
active tree has at most (m + d + 1)/2 leaves; since each operation invokes verify(root), d
operations visit all leaves provided (m + d + 1)/2 ≤ d, by an argument similar to the proof
of Lemma 2. Therefore m + 1 ≤ d suffices. □
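
The last step of this proof is a one-line rearrangement, spelled out here for convenience
(reading the comparison as non-strict, which matches the m + 1 bound stated in the lemma):

\[
\frac{m + d + 1}{2} \le d
\;\iff\; m + d + 1 \le 2d
\;\iff\; m + 1 \le d .
\]

So d = m + 1 operations already supply enough verify(root) invocations to cover every leaf
counted by the bound.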

Lemma 4 Let m = |S_A| for an arbitrary initial state of A. After at most O(m) heap
operations, all subsequent states of A are legitimate.

Proof: Lemma 3 establishes that properties (i), (iii) and (iv) hold after O(m) operations.
In the rest of this proof, we assume that properties (i), (iii) and (iv) hold. For the sake of
generality we suppose that A is only loosely balanced: assume there exist constants a and
b so that for any t, 0 < t ≤ K, the minimum height h_t taken over all subtrees of A with t
nodes that contain node root satisfies h_t ≤ a + b lg t. From this assumption, it follows that
any subtree T of A rooted at root with height exceeding h_t is nonoptimal; furthermore, it
follows that there is some node w ∈ A not contained in T, that is a child of some node of T,
so that w has depth at most h_t.

We define gap(σ) for any state σ to be a variant function:

\[
\mathit{gap}(\sigma) = \sum_{x \in S_A} v(x), \qquad \text{where } v(x) =
\begin{cases} 1 & \text{if } \mathit{depth}(x) > h_s \\ 0 & \text{otherwise} \end{cases}
\]

where the subscript s is understood as |S_A|, the size of the active tree in state σ.

It is straightforward to show that once gap is zero, any subsequent operation application
results in zero gap, and that zero gap implies balance. If the initial gap is some value g > 0,
any new item inserted into the heap is placed at minimum depth and any deleteMin removes
a node at maximum depth, so gap does not increase by the insert or delete operations.
Moreover, every operation invokes balance(root), which decreases positive gap by at least
one, so within g = O(m) operations, property (ii) is established. □

3 Availability and Stabilization

The previous section presents a heap construction that is stabilizing (Lemma 4) and also
satisfies certain properties expected of operations even when the data structure's state is
illegitimate (Lemma 1). Lemmas 1 and 4 depend on the definition of S_A. Is there a
characterization of availability and stabilization independent of implementation specifics such as
S_A? Such a characterization could be adapted to specify availability and stabilization for
general types and implementations of data structures.

Let H be an infinite history of operations on a heap, that is, H is a sequence of insert and
deleteMin invocations and corresponding responses. We characterize a heap implementation
in terms of properties of all possible operation histories, first for the case of an initially empty
heap. Let t denote a point either before any operation or between operations in H. If t is
before any operation, define C_t = ∅, otherwise let C_t = I_t \ D_t, where I_t is the bag of items
successfully inserted prior to point t, and D_t is the bag of items returned by deleteMin
operations prior to point t (recall that success or failure of an operation is judged by the
response it returns). We call C_t the heap content at point t. Heap operations satisfy the
following constraints: (a) a deleteMin operation immediately following any point t fails iff
C_t = ∅, and otherwise returns min(C_t); (b) an insert operation immediately following any
point t fails iff |C_t| = K, and otherwise returns "ack"; (c) the running time of any operation
immediately following t is O(lg |C_t|). From (a)-(c) one can show the usual heap properties,
for instance, no deleteMin returns an item not previously inserted. 
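
Constraints (a) and (b) can be checked mechanically on a finite history prefix. The sketch
below tracks the heap content as a bag and replays recorded responses against it; the tuple
encoding of a history and the name check_history are assumptions made for illustration, and
constraint (c) on running time is not checked.

```python
from collections import Counter


def check_history(history, K):
    """Replay (op, arg, response) triples from an initially empty heap.

    Returns True iff every response agrees with constraints (a) and (b).
    """
    content = Counter()  # the bag C_t at the current point t
    for op, arg, resp in history:
        size = sum(content.values())
        if op == "insert":
            expected = "heap full" if size == K else "ack"
            if resp != expected:
                return False
            if resp == "ack":
                content[arg] += 1
        elif op == "deleteMin":
            if size == 0:
                if resp != "heap empty":
                    return False
            else:
                smallest = min(content)  # keys with zero count are removed below
                if resp != smallest:
                    return False
                content[smallest] -= 1
                if content[smallest] == 0:
                    del content[smallest]
        else:
            return False  # unknown operation name
    return True
```

A history in which deleteMin returns a value that was never inserted is rejected at that step,
which is the property mentioned at the end of the paragraph above.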

The above characterization of heap behavior by history depends on C_t = ∅ for the initial point
t. Availability is a relaxation of this characterization to allow arbitrary initial heap content.
Let P denote a history fragment starting from an initially empty heap, that consists of at
most K successful insert operations. To specify behavior of H for an arbitrary initial heap,
let H' = P ∘ H (where ∘ denotes catenation). A heap implementation is available if, for each
history H of operations, there exists P such that H' satisfies constraint (a), each operation in
H' has O(lg K) running time, and any insert operation following any point t fails if |C_t| = K
(but is allowed to fail even if |C_t| ≠ K). The construction of Section 2 satisfies availability,
as shown by Lemma 1, by choosing P to be a sequence of insert operations for the items of
the active tree at the initial state. The simplest heap implementation satisfying availability
is one that returns a failing response to every operation (P is empty and the heap content is
continuously empty in this case).

Stabilization is also a weakening of (a)-(c). Let H_t denote the suffix of history H following
point t. A heap implementation is stabilizing if, for each history H of operations, there exists
P (a history fragment of successful insert operations) and a point t such that (a)-(c) hold
for P ∘ H_t. We call the history prefix C satisfying H = C ∘ H_t the convergence period of H. The
construction of Section 2 is stabilizing by choosing P to contain insert operations for the
active tree at some point t that exists by Lemma 4. The definition of stabilization permits
operations to arbitrarily succeed or fail during the convergence period, and a deleteMin
operation could return a value unrelated to heap content within the convergence period. A
plausible stabilizing heap implementation is one that resets the heap content to be empty
whenever some inconsistency is detected during the processing of an operation (resetting the
heap amounts to establishing a legitimate "initial" state for subsequent operations).

4 Discussion 

The heap construction presented here satisfies desired availability properties: success or
failure in an operation response is a reliable indication of the operation's result on the data
structure. We have not addressed the issue of relating heap damage to the extent of a fault:
if a fault somehow sets root.val = ∞, then the entire heap contents are lost by our
construction. Our intent is to separate concerns by first developing a stabilizing heap, and then
later adding logic for limited cases of corrupted items. This is a topic for future work.